DOCUMENT RESUME

ED 144 602                                    IR 005 230

AUTHOR          Wood, Lewis Jay
TITLE           Computer Assisted Test Construction in the BYU Library School.
PUB DATE        Aug 76
NOTE            38p.
EDRS PRICE      MF-$0.83 HC-$2.06 Plus Postage.
DESCRIPTORS     *Computer Assisted Instruction; Item Banks; Library Education; Post Testing; Pretesting; *Test Construction; *Testing; Tests
IDENTIFIERS     Brigham Young University UT



ABSTRACT

Computer assisted test construction (CATC) is a new testing technique that seems to provide ease and flexibility for faculty members and students. The purpose of this paper was to verify that student test scores are not adversely affected by implementation of CATC. Two sections of a basic course in cataloging were tested for one semester. One section of 30 students was used as a control group while the other 20 students served as the experimental group. The students were pre-tested and then received class instruction. During the course of the semester, unit exams were administered. Exams for the control group were prepared by the course instructor. The experimental group's exams were compiled of random questions selected by the computer from the item bank. The analysis of scores from a post-test confirmed the hypothesis that there would be no detrimental effect upon test scores for students tested with the CATC system. The pre-test used in the study is included in the appendix. (Author/JAB)






U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL NATIONAL INSTITUTE OF EDUCATION POSITION OR POLICY.



COMPUTER ASSISTED TEST CONSTRUCTION
IN THE BYU LIBRARY SCHOOL

by

LEWIS JAY (BUD) WOOD

A Research Paper
Submitted to The
School of Library and Information Sciences
Brigham Young University
Provo, Utah

In Partial Fulfillment
of the Requirements of
L.I.S. 697

August, 1976

"PERMISSION TO REPRODUCE THIS MATERIAL HAS BEEN GRANTED BY Lewis Jay Wood TO THE EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC) AND USERS OF THE ERIC SYSTEM."



ABSTRACT

The problem treated in this paper was one of determination of effect upon student test scores caused by implementation of a computer assisted test construction (CATC) system.

Two sections of L.I.S. 528, a basic course in cataloging, were tested for one semester. One section was used as a control group. The other served for experimental purposes. The students were pre-tested and then each received class instruction. During the course of the semester, unit exams were administered to evaluate individual progress. At the conclusion of the semester, the students were post-tested.

It was found that students in both sections attempted to memorize the unit exams. Students in the control section were successful, while those in the experimental section were unable to do so.

The post-test analysis confirmed the hypothesis that there would be no detrimental effect upon test scores for students tested with the CATC system.



PREFACE

This paper is written not only with the intent of fulfilling requirements for L.I.S. 697, but also in hopes that it, or a variation of it, might be used by the Brigham Young University (BYU) Department of Instructional Evaluation and Testing to help promote the use of computer assisted test construction (CATC) at the University.

The author, who serves as manager of Testing Services, takes full responsibility for the statements and judgments made herein, as they are based on his experience in Testing Services and his efforts in support of CATC at BYU since 1971. Gratitude is felt for the help and assistance of Dr. Adrian Van Mondfrans, director of the BYU Department of Instructional Evaluation and Testing, who permitted me time off work to complete my degree; to Dr. Merle Lamson, of the BYU School of Library and Information Sciences, for use of his L.I.S. 528 classes; to the students of L.I.S. 528, Winter Semester, 1976, for their willing (or unwilling) help; and to my wife, Elaine, and sons, Michael, Mark, Nathan, Scott, and Wendell, for four years of enduring while Dad went back to school.




TABLE OF CONTENTS

PREFACE
LIST OF TABLES

Chapter

1. INTRODUCTION
     STATEMENT OF THE PROBLEM
     HYPOTHESIS
     DEFINITIONS
       Item
       Seed
       Item Bank
       Foil
       Generation
       Test Generator

2. CATC SYSTEMS--A BACKGROUND

3. ITEM BANK DEVELOPMENT AND TEST DESIGN: METHODOLOGY
     THE CONTROL GROUP
     THE EXPERIMENTAL GROUP
     ANALYSIS OF THE TWO GROUPS--METHODOLOGY

4. ANALYSIS OF L.I.S. 528 TEST DATA
     PRE-TEST RESULTS
     UNIT EXAMS--TEST RESULTS
       Retakes of Unit Exams
       Multiple Retakes of Unit Exams
       Unit Exam Summary
     POST-TEST ANALYSIS AND PRE-POST COMPARISONS

5. CONCLUSIONS AND SUGGESTIONS
     PRE-TEST RESULTS
     UNIT EXAM RESULTS
     POST-TEST RESULTS
     SUMMARY AND SUGGESTIONS FOR FURTHER STUDY

APPENDIX

SOURCES CONSULTED




LIST OF TABLES

Table

1. Pre-test Results
2. Unit Exam Results
3. Percentage of Students Retaking Exams
4. Multiple Retakes of Exams
5. Pre-post-test Comparisons




Chapter 1

INTRODUCTION

Statement of the Problem

Computer assisted test construction (CATC), when coupled with an out-of-class testing program, is a relatively new testing technique that seems to provide ease and flexibility for the faculty member and the student. It is important, however, before this program becomes widely used, to verify the assumption that student test scores are not adversely affected by its implementation. Such verification is the problem treated by this paper.

Hypothesis

The hypothesis tested in the following pages is that test scores from students involved in an out-of-class CATC testing program at Brigham Young University (BYU) will not be significantly lower than scores from students tested with traditional "teacher-made" examinations.

Definitions

Certain terms, as explained below, have specific and unique meanings as they are used in this paper. Such terms are defined as follows:

Item. An item is a test question, either written by a faculty member or made up by a computer, to be included in an examination.

Seed. A seed is a question in skeletal form that provides a framework upon which a computer will build to make up a test question. For example, "_____ was the _____ President of the United States." would be a seed that could be completed by the insertion of the words "Abraham Lincoln" and "sixteenth" to make a true-false question.

Item bank. An item bank is a collection of test questions or seeds, in machine readable form, from which items are selected to assemble a student test.

Foil. A foil is an alternative answer to an item. Given the question, "Who was the sixteenth U.S. President?", and the alternatives "George Washington," "Abraham Lincoln," and "John Kennedy," these three alternatives would be foils.

Generation. The act of assembling items to create a student exam is called generation.

Test generator. A computer program that assembles test questions to create a test is a test generator.
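
The notion of a seed can be illustrated with a short sketch. The Python below is not the BYU generator, whose internals are not described in this paper; the function name `fill_seed` and the use of a run of underscores to mark a blank are assumptions made only for illustration.

```python
def fill_seed(seed, insertions, blank="_____"):
    """Complete a seed by replacing its blanks, left to right, with the insertions."""
    parts = seed.split(blank)
    if len(parts) - 1 != len(insertions):
        raise ValueError("number of insertions must match number of blanks")
    filled = parts[0]
    for word, tail in zip(insertions, parts[1:]):
        filled += word + tail
    return filled


# The seed and insertions used as the example in the definition of "Seed".
SEED = "_____ was the _____ President of the United States."
item = fill_seed(SEED, ["Abraham Lincoln", "sixteenth"])
print(item)
```

Supplying a different pair of insertions, such as a different name with the same ordinal, would turn the same seed into a false statement, which is how a single seed could yield several true-false items.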



Chapter 2

CATC SYSTEMS--A BACKGROUND

A fairly recent development at Brigham Young University is a computer assisted test construction facility administered by the University's Department of Instructional Evaluation and Testing. BYU's CATC programs involve the computer in test construction and analysis as well as in grading and recordkeeping to assist the faculty member in his task of student evaluation.

According to Gerald Lippey, manager of Advanced Instructional Applications Development in the Data Processing Division of IBM, "The most noteworthy changes in educational testing during the past few decades have been those which resulted from technological progress."¹ The late 1940's and 1950's saw the advent of machines developed to quickly and efficiently score student examinations. Since that time, machine scoring technology has advanced so that several thousand exams can now be scored each hour by a single, relatively untrained person. As the scoring is done, data can be gathered to provide somewhat sophisticated statistical measures of not only the exam, taken as a whole, but also of each alternative or foil on each question of the exam.

¹Gerald Lippey, ed., Computer Assisted Test Construction (Englewood Cliffs, New Jersey: Educational Technology Publications, 1974), p. 3.



Computer assisted test construction came to the Brigham Young University campus in 1971, when Testing Services (officially, the Department of Instructional Evaluation and Testing) developed an initial test generator program to select items randomly from an item bank of questions provided by the History department. In the past four years, CATC testing at BYU has grown from a yearly total of 2,000 exams administered to a projected 1975-1976 academic year total of 175,000 examinations.²

Exactly what is CATC? Computer assisted test construction has many facets, yet there are several commonalities amongst all systems as they have developed in the United States. The systems include an item bank for examination purposes. In most cases the bank itself is stored in the computer, although at times only indices to items, or item seeds, are machine readable.

Most of the items in existing CATC systems are objective. However, they need not be. BYU's test generator program can handle not only objective questions (of which true-false are simply a subset), but also matching, short answer, and essay questions. CATC involves, in every case, the use of a computer to select items from the item bank. The algorithm for selection is generally quite simple in that items are basically classified before selection by subject and/or other measures such as difficulty or grade level. The instructor then sets parameters to be used by the generator program, indicating test length, subject composition, difficulty, etc.

²Lewis J. Wood, "Request for Approval of Computer Hardware Procurement" (Provo, Utah: Brigham Young University Department of Instructional Evaluation and Testing, 1975), p. 4.
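
The selection step described above, in which items are classified in advance and drawn according to instructor parameters, might be sketched as follows. This is a minimal illustration in Python and not the BYU program: the record layout (`id`, `subject`, `difficulty`), the function `select_items`, and the subject names are all hypothetical.

```python
import random


def select_items(bank, subject_counts, max_difficulty, rng=None):
    """Draw items at random from a pre-classified bank.

    bank           -- list of dicts with 'id', 'subject', 'difficulty' (assumed layout)
    subject_counts -- items wanted per subject (the "subject composition")
    max_difficulty -- items rated harder than this are skipped
    """
    rng = rng or random.Random()
    chosen = []
    for subject, count in subject_counts.items():
        pool = [it for it in bank
                if it["subject"] == subject and it["difficulty"] <= max_difficulty]
        if len(pool) < count:
            raise ValueError(f"not enough items for subject {subject!r}")
        chosen.extend(rng.sample(pool, count))
    rng.shuffle(chosen)  # randomize the order in which items are printed
    return chosen


# A tiny made-up bank; subjects and difficulty ratings are illustrative only.
classified = [("main entry", 1), ("main entry", 2), ("main entry", 3),
              ("subject headings", 1), ("subject headings", 2),
              ("classification", 2), ("classification", 4)]
bank = [{"id": i, "subject": s, "difficulty": d} for i, (s, d) in enumerate(classified)]

form = select_items(bank, {"main entry": 2, "subject headings": 1},
                    max_difficulty=2, rng=random.Random(1976))
print([it["id"] for it in form])
```

Test length here is simply the sum of the per-subject counts; a real generator would also have to decide what to do when a class of items is too small, which the sketch handles by raising an error.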

CATC, especially when linked to an out-of-class testing program such as that offered at BYU, seems to offer several advantages over a conventional testing program. Use of the computer to gather item response information and statistical measures on each foil allows for continual item improvement, for example.

Because item randomization virtually assures test security by preventing "leaks" of test forms, out-of-class testing becomes practical. Such testing procedures provide increased lecture time within the classroom. For the student, out-of-class testing provides flexibility to his schedule, allowing him to avoid "heavy" days wherein he would normally have several exams.

The proper development of item banks and their increasing usage provide an opportunity for sharing of items among several faculty members. While this practice is possible without a machine readable item bank, the existence of the bank makes it practical and easy to share questions of potential worth among faculty members.

Cost is another major advantage in a well-designed CATC system where multiple forms of several equivalent examinations can be quickly, easily, and inexpensively generated. "Computer costs per printed test last year (at BYU) totaled $0.4334. . . ."³ Brigham Young University's CATC program has been designed so that fourteen to sixteen students use the same form of each exam. This sacrifices some test security, but also cuts costs. The per-student test generation cost to the department last year was only $0.0271.⁴ In a recent study performed by the author for the Department of Instructional Evaluation and Testing, the cost of generating student examinations, administration of the tests, scoring, and reporting back cumulative grades to the faculty was found to be $0.3818 per student per test. This compares favorably to the $[illegible].72 per student per test cost found for conventional test preparation, administration, scoring, and recording.⁵
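
The per-student generation cost appears to follow from spreading the cost of one printed form over the students who share it. The paper does not state this calculation explicitly, so the short Python check below rests on that assumption; the function name is illustrative.

```python
COST_PER_PRINTED_FORM = 0.4334  # dollars per printed test, as quoted in the text


def generation_cost_per_student(cost_per_form, students_per_form):
    """Cost of one generated form divided among the students who use it (assumed model)."""
    return cost_per_form / students_per_form


# Fourteen to sixteen students share each form.
at_sixteen = generation_cost_per_student(COST_PER_PRINTED_FORM, 16)
at_fourteen = generation_cost_per_student(COST_PER_PRINTED_FORM, 14)
print(f"${at_sixteen:.4f} to ${at_fourteen:.4f} per student")
```

Under this assumption, sixteen students per form reproduces the reported $0.0271 figure, while fourteen students per form gives a slightly higher cost.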

In order to take advantage, then, of the CATC program, what steps must be taken? The first step is the selection or development of a test generator program. Points to consider here include minimization of costs, criteria for item identification and selection, file organization, and item response feedback for question evaluation. Each of these questions has to be answered by the systems analyst who will design the test generator. Inasmuch as the BYU program is already written, it now is sufficient to mention these points and then pass on to the particular problem at hand: the development of an item bank to be used in L.I.S. 528, the cataloging course in the BYU School of Library and Information Sciences.

³Lewis J. Wood, "Establishing a CATC System: Where to Begin" (paper read at the Second Annual Conference on Computer Assisted Test Construction, Atlanta, Georgia, October 13, 1975), p. 5.

⁴Ibid.

⁵Lewis J. Wood, "Preliminary Analysis of Costs and Revenues for Modular Testing" (Provo, Utah: Brigham Young University Department of Instructional Evaluation and Testing, 1975), p. 5.



Chapter 3

ITEM BANK DEVELOPMENT AND TEST DESIGN: METHODOLOGY



Before an adequate item bank could be developed for L.I.S. 528, an item analysis had to be run on the items previously used in the class's five unit examinations. This analysis provided not only frequency response information on each item foil, but also a point-biserial correlation for that foil. The item analysis program then used that correlation figure to perform a question evaluation, which rated the item's discrimination record from "A" (a discriminating item) to "E" (an ambiguous one). Based upon this question evaluation, the item pool was culled of all "D" and "E" questions before it was put into the machine readable format.
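
For readers unfamiliar with the statistic, the point-biserial correlation for a foil relates whether each examinee chose that foil to the examinee's total score. The sketch below uses the standard textbook formula in Python; it is not the BYU item analysis program, and the cutoffs that program used for its "A" through "E" ratings are not given in this paper, so no rating step is attempted. The sample data are invented.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(chose_foil, total_scores):
    """Point-biserial correlation between choosing a foil and total test score.

    chose_foil   -- list of booleans, one per examinee
    total_scores -- list of total scores, in the same examinee order
    """
    picked = [s for c, s in zip(chose_foil, total_scores) if c]
    others = [s for c, s in zip(chose_foil, total_scores) if not c]
    if not picked or not others:
        raise ValueError("foil must be chosen by some, but not all, examinees")
    p = len(picked) / len(total_scores)  # proportion choosing the foil
    sd = pstdev(total_scores)            # population s.d. of all total scores
    return (mean(picked) - mean(others)) / sd * sqrt(p * (1 - p))


# Invented data: the three highest scorers chose the foil in question.
scores = [90, 85, 80, 70, 65, 60]
chose = [True, True, True, False, False, False]
r_correct = point_biserial(chose, scores)
r_wrong = point_biserial([not c for c in chose], scores)
print(round(r_correct, 3), round(r_wrong, 3))
```

A strongly positive value for the keyed answer and negative values for the wrong alternatives are what one would expect of a discriminating item; values near zero suggest the alternative does not separate stronger from weaker students.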

Because of the design specifications of TESTGEN, the test generator program in use at BYU, the items for the pool were keypunched to eighty-column cards. The first four columns of each card, as well as the last five columns of the last card in the question, were used for control and identification purposes. The item could be of any length, however, as any number of cards could be used for the question text. As a backup, to be used in cases of extreme machine error, the entire item bank was also stored on tape as well as on disc. The disc file was the actual "generation file" from which the items were randomly selected.



The item pool for use in L.I.S. 528 was strictly multiple choice in format. Not only is this the traditional approach to unit exams for this course, it is also the easiest type of question for which to quantify test results. Consequently, no change in item format was made. However, using a technique first reported by Denney in 1973¹ but independently developed at BYU in 1971, the multiple choice questions had a unique difference. Several incorrect responses were loaded into the item bank for each question and the generator program randomly selected not only the item to be used on the examination, but also the responses to be printed with that item. Thus, while two forms of the exam could contain the same item, the foils to that item could be unique.
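
The random choice of responses for a single item might look like the following Python sketch. It is illustrative only and does not reproduce TESTGEN: the function `build_choices` is hypothetical, and all but the first two incorrect answers in the pool were added here simply to give the generator something to choose among.

```python
import random


def build_choices(correct, incorrect_pool, n_incorrect, rng=None):
    """Return the correct answer plus n_incorrect randomly drawn wrong answers, shuffled."""
    rng = rng or random.Random()
    choices = rng.sample(incorrect_pool, n_incorrect) + [correct]
    rng.shuffle(choices)
    return choices


QUESTION = "Who was the sixteenth U.S. President?"
CORRECT = "Abraham Lincoln"
# "George Washington" and "John Kennedy" come from the example in Chapter 1;
# the remaining names are additions made for this illustration.
INCORRECT_POOL = ["George Washington", "John Kennedy",
                  "Thomas Jefferson", "Ulysses S. Grant", "Andrew Jackson"]

form_a = build_choices(CORRECT, INCORRECT_POOL, 2, random.Random(1))
form_b = build_choices(CORRECT, INCORRECT_POOL, 2, random.Random(2))
print(QUESTION, form_a, form_b)
```

Each form always carries the correct answer, but the accompanying wrong answers and their printed order depend on the draw, so two students holding the same item need not see the same alternatives.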

One problem of particular significance in the development of any machine-readable item bank is that of determining the size of the bank. In other words, how many questions are needed in an item bank? "Many theoretical and practical factors are involved in the final decision. The number of items must be adequate to cover the subject matter. . . ."² Lippey judged that about fifty items are needed per class hour of presented material. Donald Jensen, professor of psychology at the University of Nebraska, postulated that about ten times as many relevant items are needed in the bank as will appear on each test. It is interesting to note, however, that the two methods yield somewhat similar results. Jensen administered about eight examinations in a semester, using a total of 2,400 items. Lippey's rule applied to a similar course meeting forty-five times yields 2,250 items.³

¹C. Denney, "More to a Test Pool than Data Collection," Educational Technology 13 (1973), 19-20.

²Gerald Lippey, ed., Computer Assisted Test Construction (Englewood Cliffs, New Jersey: Educational Technology Publications, 1974), p. 48.
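
The two rules of thumb for bank size can be compared with a few lines of arithmetic. The Python below is only a worked check of the figures quoted above; the function names are illustrative, and the figure of thirty items per test is inferred here on the assumption that each of Jensen's eight examinations drew on its own, non-overlapping tenth of the bank.

```python
def lippey_bank_size(class_hours, items_per_hour=50):
    """Lippey's rule: about fifty items per class hour of presented material."""
    return class_hours * items_per_hour


def jensen_bank_size(items_per_test, n_tests, multiplier=10):
    """Jensen's rule: about ten times as many relevant items as appear on each test."""
    return multiplier * items_per_test * n_tests


lippey_total = lippey_bank_size(45)        # course meeting forty-five times
implied_items_per_test = 2400 / (10 * 8)   # Jensen: 2,400 items, eight exams
jensen_total = jensen_bank_size(30, 8)
print(lippey_total, implied_items_per_test, jensen_total)
```

On these assumptions the two rules differ by only 150 items for a semester-length course, which is the similarity the text remarks upon.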

For the above reasons, it was originally planned to apply the CATC approach to the L.I.S. 528 quizzes rather than to the unit exams. Inasmuch as this was not possible, the available items from the unit exams were used. These items totaled only approximately twice the number of items to be selected for each test generated.

Once the item pool was prepared and edited, it then became possible to set up testing programs for two separate sections of L.I.S. 528, to perform pre-test analysis, and to run the students through the programs, using one section for control and one for experimental purposes. The following paragraphs explain how the program worked.

³Ibid., p. 49.

The Control Group

The students in the control group attended class in the traditional manner. Weekly quizzes on cataloging, written by the class instructor, were administered in the modular test center located in the Grant Building on the BYU campus. The center was open from 8:00 AM until 6:00 PM on Mondays; 8:00 AM until 8:00 PM, Tuesdays through Fridays; and 9:00 AM until 1:00 PM on Saturdays. Each of the five unit exams, also written by the class instructor, was in a like manner administered. Each quiz and unit exam had to be passed with a score of 81 percent or above. Failure to do so necessitated a student retaking the test. The retake quiz was made up of questions similar to those on the first quiz. The retake exam, as has been traditional, was made up of the same items, printed in a different order. To keep track of which version of an exam was taken, each student was issued a testing card by Testing Services. As soon as the student finished his test, it was hand-scored and he was then able to review the material on the test to see where mistakes were made.

The Experimental Group

The students in the experimental group also attended class in the traditional fashion. The same weekly quizzes administered to the control section were also administered to students in the experimental group. Methods of test administration were the same for both groups, as were modular center hours.

The experimental section was also administered five unit exams, but these exams were composed of items randomly selected from the item pool. As in the control group, each student had to retake a quiz or exam if he failed to attain a minimum of 81 percent. Retakes of quizzes for the experimental class were conducted the same as for the control class. Retakes of the unit exams, however, were different inasmuch as the retake exam was another randomly selected (instead of simply re-ordered) test. Compared with the student's first version of the exam, some items were expected to be the same; others would be the same question with a different set of alternatives; while other items would be completely different. Item order was also randomized, in that the first question on form "A" was not necessarily the first question on form "B", if, in fact, it appeared on form "B" at all. Testing Services personnel recorded which forms of the quizzes and unit exams were taken by students in the experimental group by again using a testing card, as in the case of the control group. In order to separate the two groups and to insure the proper examination was given to each student, the testing cards were color-coded.

As soon as the student in the experimental section finished his paper, he took it to the control clerk where it was hand-scored. He was then able to look over his exam and determine where he made his mistakes.



Analysis of the Two Groups--Methodology

Assuming a normal distribution of students as they enrolled at the beginning of the semester, the students in the control and experimental sections of L.I.S. 528 were thought to be equally "unknowing." To verify this assumption, a pre-test, made up of questions selected from the unit exams and weekly quizzes, was written by the author of this paper and is included in the appendix. This pre-test was administered to students in both sections. A t-test was run against the scores to verify that no difference existed at the 0.01 level of significance.

At the conclusion of the semester, the same pre-test was again administered and these "post" test results were measured for statistical significance. Gains in both groups in pre- vs. post-testing were anticipated, but expectations concerning differences of post-test scores between the two groups were not known. Higher post-test experimental scores or approximately equal post-test scores between the two groups should be indicative of a significant advantage of the CATC or experimental approach over the traditional testing due to ease of testing and grading for the faculty member. Significantly lower post-test experimental scores should, on the other hand, indicate the experimental approach had been detrimental to the students' learning processes.

Due to the required minimum passing score of 81 percent for both sections, t-tests were not run on the unit exam scores, but the hypothesis was made concerning these scores that the students in the control section would simply "memorize" the tests, while those in the experimental section would not be able to do so. Assuming this to be the case, the author anticipated that the highest mean scores for the experimental section would be lower than comparable scores for the control section, while the standard deviation (s.d.) of these scores for the experimental section would be greater. In other words, since the students in the experimental section were unable to memorize the tests, their scores would be lower and the curve would be more spread out than the curve of the control section.



If, as has been hypothesized, students have attempted to memorize the unit exams, it would also be expected that the greatest number of multiple retakes (third or fourth attempts at passing) would occur in the experimental section on Test 1, where the students anticipated returning to retake the same exam. As it became apparent that item memorization would not be helpful, increased study previous to taking an exam would become more appropriate and, consequently, it might be seen that total retakes for the experimental section would drop to less than that of the control section as the testing continued through the semester.

Chapter 4

ANALYSIS OF L.I.S. 528 TEST DATA

Pre-test Results

Approximately fifty-five students enrolled in L.I.S. 528 for Winter Semester, 1976, but 10 percent dropped the class before it actually began. Of the fifty students remaining who began instruction and actually took the pre-test, thirty were in Section [illegible] and were designated as the control group. The remaining twenty students enrolled in Section [illegible] and constituted the experimental group of students.

The t-test results of the pre-test can be seen in Table 1, below.

Table 1

Pre-test Results

Group                      N     Mean      S.D.     Significance
Control                    30    12.833    2.437
Experimental               20    12.300    2.003
Cross Test Significance                             None



As can be seen, while the mean score of the control group exceeded that of the experimental section, the observed difference was not considered to be statistically significant. This result verifies the assumption that the two sections started the course at the same point. In other words, the two groups were, in fact, equally "unknowing."
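
The pre-test comparison can be approximately re-derived from the reported group sizes, means, and standard deviations. The paper does not say which form of the t-test was run, so the Python sketch below assumes the ordinary pooled-variance, two-sample t-test and treats the reported s.d. values as sample standard deviations; both are assumptions, not statements about the original analysis.

```python
from math import sqrt


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Control: n=30, mean 12.833, s.d. 2.437; experimental: n=20, mean 12.300, s.d. 2.003.
t_stat, df = pooled_t(30, 12.833, 2.437, 20, 12.300, 2.003)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

Under these assumptions the statistic is well under 1, far below the two-tailed critical value of roughly 2.68 for 48 degrees of freedom at the 0.01 level, which agrees with the finding of no significant difference.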

Unit Exams--Test Results

As can be seen from an examination of Table 2, analysis of the data collected from the unit exams is interesting.

Table 2

Unit Exam Results

              First Attempt          Highest Attempt        Average
Test          Mean      S.D.         Mean      S.D.         Retake Gain

Control Group
1             75.88     11.97        90.49     6.78         28.19
2             77.44     13.24        93.[illegible]  5.76   28.27
3             80.22     9.66         90.82     6.36         23.86
4             83.65     8.31         88.46     7.31         25.00
5             85.63     9.22         90.25     4.78         23.13

Experimental Group
1             70.25     12.30        88.25     7.95         28.50
2             73.71     8.35         84.75     5.[illegible]  22.00
3             77.25     [illegible]  84.75     6.00         18.75
4             82.00     12.90        89.50     8.00         15.00
5             84.25     8.55         85.75     [illegible]  25.00

In four of the five cases, the experimental students scored lower, after retaking the unit exams, than did the control students. Also, in four of the five cases, the s.d. of the control group was less than that of the experimental section. In three of the cases, the average gain over initial scores obtained by control students as they retook their exams exceeded the gain made by experimental students. In one case, this difference was a full ten points. In the two cases where this situation reversed itself, the experimental gain over the control gain was less than two points. (In fact, on Test 1, the experimental gain was less than one third of one point better than the control gain.)

Retakes of Unit Exams. The percentage of students retaking exams was quite comparable for four of the five unit exams. Only on Test 5 did the experimental students drastically differ in the number of retakes. Table 3 shows this data.

Table 3

Percentage of Students Retaking Exams

Test    Control Group    Experimental Group
1       56               57
2       56               48
3       44               43
4       [illegible]      [illegible]
5       19               22



Multiple Retakes of Unit Exams. Multiple retakes of unit exams were significantly different for the two groups. Only 1.03 percent of the total exams taken by the control group were multiple retakes, while 5.37 percent of the experimental exams fell into this category. Fifty percent of all experimental multiple retakes occurred on Test 1, but this number declined steadily until none of the students in the experimental section had to take the fifth test more than twice. These retakes, as a percent of the number of students initially taking the tests and as the actual numbers of multiple retakes recorded, are shown in Table 4.

Table 4

Multiple Retakes of Exams

          Control Group (n=27)           Experimental Group (n=20)
Test      % of Class    # of Retakes     % of Class    # of Retakes
1         3.7           1                20.0          4
2         0.0           0                10.0          2
3         0.0           0                5.0           1
4         0.0           0                5.0           1
5         3.7           1                0.0           0

Unit Exam Summary. In summary, the analysis of unit exam scores shows slightly higher retake means for the control section than for the experimental group. Conversely, experimental s.d.'s are smaller. Retake percentages tended to be quite close until Test 5, where the experimental group as a whole retook the exam less than half the number of times that the control group did. Analysis of the multiple retakes of exams shows this happened almost five times as frequently (on a per capita basis) in the experimental class as in the control section. Exactly one-half the experimental multiple retakes took place on the first unit exam.

Post-test Analysis and Pre-post Comparisons

The sample sizes in the control and experimental groups declined during the semester to where only 41 students took the post-test. Seventeen of these were in the experimental section and the remaining 24 were in the control group.

As can be seen in Table 5, there was a considerable gain for each section over its pre-test scores. T-test analysis of this gain showed it to be significant at the .001 level of confidence. A similar analysis of post-test means between the control and experimental sections showed there to be no significant difference between the two groups. A close examination of the mean gain for each section on the post-test over its pre-test score shows the students in the experimental section outscored their counterparts in the control group by 0.849 points. An analysis of covariance was run in an attempt to determine if this more sensitive test could find significance in this gain, but none existed at the .01 level.
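The mean gains and the 0.849-point difference follow directly from the pre-test and post-test means reported in Table 5. The short sketch below simply reproduces that subtraction; the dictionary layout and function name are illustrative choices, not part of the original study.

```python
# Pre-test and post-test means as printed in Table 5, by section.
table5 = {
    "Control":      {"pre_mean": 12.833, "post_mean": 19.625},
    "Experimental": {"pre_mean": 12.300, "post_mean": 19.941},
}

def mean_gain(row):
    """Post-test mean minus pre-test mean, rounded to three decimals."""
    return round(row["post_mean"] - row["pre_mean"], 3)

gains = {group: mean_gain(row) for group, row in table5.items()}
difference = round(gains["Experimental"] - gains["Control"], 3)

print(gains)        # {'Control': 6.792, 'Experimental': 7.641}
print(difference)   # 0.849
```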



Table 5

Pre-post-test Comparisons

Group           Pre      Pre     Post     Post    Mean     Signifi-
                Mean     S.D.    Mean     S.D.    Gain     cance
Control        12.833   2.437   19.625   4.332   6.792     .001
Experimental   12.300   2.003   19.941   4.235   7.641     .001

A summary tabulation of the pre-post-test data concludes that the two groups started together and, while mean gains in post-test scores over scores on the pre-test were significant at the .001 level, the two groups had no significantly different learning experiences. This is in spite of the observed difference in mean gain scores, where the experimental section did better.
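As a rough check on the finding of no significant difference between post-test means, the comparison can be sketched from the summary figures alone. The example assumes an independent-samples t-test with pooled variance, which the paper does not specify, and assumes that the post-test counts of 17 and 24 students apply to the post-test means and standard deviations in Table 5. The function name is hypothetical.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df

# Post-test figures from Table 5: experimental section first, control second.
# The group sizes (17 and 24) are taken from the text and assumed to match.
t, df = pooled_t(19.941, 4.235, 17, 19.625, 4.332, 24)
print(f"t = {t:.2f}, df = {df}")
```

Under these assumptions the statistic comes out at about 0.23 with 39 degrees of freedom, far below the two-tailed .05 critical value of roughly 2.02, which agrees with the conclusion stated above.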






Chapter 5 

CONCLUSIONS AND SUGGESTIONS

Analysis of the data collected has proven to be quite conclusive, as follows.



Pre-test Results 



The t-test showed the two groups to be equally "unknowing" at the beginning of the semester. Thus, no adjustment of test scores had to be made in order to make post-test comparisons.

Unit Exam Results

As anticipated, retake scores from the control section were higher and more closely clustered than were the same scores from the experimental section. This, coupled with the retake patterns of the two groups and the excessive multiple retakes on Test 1 for the experimental section, provides conclusive evidence, in the author's opinion, of attempted test memorization on the part of both groups. In a like manner, the data suggest the experimental approach of random item selection to be fairly effective in combating this attempt, with resultant increased study on the part of those students in the experimental section.
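The random item selection described here can be illustrated with a short sketch. It is not the CATC system used in the study. The bank contents, item identifiers, exam length, and function name are all hypothetical, and the only point is that each attempt draws a fresh random subset of the bank, so memorizing one exam is of limited help on a retake.

```python
import random

def build_exam(item_bank, num_items, rng=None):
    """Draw num_items distinct questions at random from the item bank."""
    rng = rng or random.Random()
    return rng.sample(item_bank, num_items)

# Hypothetical bank of 50 item identifiers for one unit.
bank = [f"unit1-item{i:02d}" for i in range(1, 51)]

# Separate seeds stand in for separate sittings of the same unit exam.
first_attempt = build_exam(bank, 10, random.Random(1))
retake = build_exam(bank, 10, random.Random(2))

# Items common to both sittings; with a large bank this is usually a minority.
overlap = set(first_attempt) & set(retake)
print(len(first_attempt), len(retake), len(overlap))
```

A larger bank relative to the exam length lowers the expected overlap between sittings, which is the property the memorization argument depends on.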






Post-test Results

The experimental section showed more gain over pre-test scores than did the control section, but this gain was not significant at the .01 level. The fact that the experimental students did at least as well as the control students is significant, however, in that it verifies the primary hypothesis of this paper that scores of students involved with CATC testing should not be lower than the scores of students not so involved.





Summary and Suggestions for Further Study

The data gathered and presented in this paper present a strong case, in the author's opinion, for CATC testing. The tests are easier for the instructor to produce, after the data bank is prepared, than are conventional exams. They also seem to have some advantages over the conventional exam if that exam is to be retaken to measure a student's learning and growth. Test memorization becomes impractical in such a CATC environment and this, in and of itself, can lead to increased study. Test security also is less of a problem, as students find no reason to pass questions to their peers.

More importantly from the viewpoint of the class instructor, the data suggest there is no decrease in the student's learning due to CATC testing. On the contrary, it might just provide the stimulus for further study and a resultant enriched learning experience.

Thinking of this paper as a pilot to guide those to follow, expansion of the project so as to involve several hundred students over multiple semesters in an extensive evaluation of the impact of CATC testing might be fruitful. Preliminary analysis of this impact indicates the emergence of a new tool, which may prove to be quite useful in enriching the educational process for students involved in its use.




APPENDIX 




GRADUATE SCHOOL OF LIBRARY & INFORMATION SCIENCES

518                                DO NOT WRITE ON THIS TEST
Organization and Processing        M. Lamson
of Materials                       Winter, 1976

PRE-ASSESSMENT INSTRUMENT

Please code your name and social security number in the appropriate boxes on the answer sheet.

GOOD LUCK!

1. What is the best way to catalog maps?

   a) use LC
   b) use AACR
   c) use American Geographical Society
   d) there is no one "best" way
   e) none of the above

2. Which publication seems to be the only one dealing with county and municipal items?

   a) Monthly Catalog
   b) Checklist of State Documents
   c) PAIS
   d) Municipal Yearbook
   e) none of the above



3. What is the major problem in cataloging music with generic titles?

   a) establishing the standard title
   b) no particular major problem
   c) establishing the composer
   d) none of the above--there is a problem, but it's not here

4. What is the difference between a full score and a miniature score?

   a) none
   b) size
   c) use
   d) none of these--there is a difference, but it's not here

5. The use of color-coded cards is recommended for use in cataloging media.

   a) true
   b) false






6. What is subject cataloging?

   a) Subject cataloging deals with making the entire catalog card
   b) Subject cataloging deals only with establishing the main entry
   c) Subject cataloging deals only with deciding upon which subject headings to use

7. What are the two major categories of biography?

   a) collective - individual
   b) lives of persons - as a form of writing
   c) collective - individual - ad hoc
   d) as a form of writing - ad hoc
   e) none of the above

8. The field of literature has two classes of materials which must be distinguished carefully. They are

   a) belles-lettres - collections
   b) collections - works about literature
   c) belles-lettres - individual literature
   d) works about literature - belles-lettres
   e) none of these

9. There are several uses for a shelf list. Which of the following is not one of them?

   a) protection against duplication of a call number
   b) buying guide
   c) inventory control
   d) record of achievement
   e) aid in classification

10. What constitutes a "set of cards?"

    [option letter illegible] main entry card, plus one card for each tracing
    a) main entry [illegible]
    b) main entry [illegible] list
    c) main entry [illegible]
    d) none of the above


11. The LC Subject [illegible] can also serve as:

    a) quasi-relative index to LC
    b) a finding device
    c) a name file
    d) none of these




12. What edition is unabridged DC now in?

    a) 14th
    b) 17th
       20th
    e) none of the above



13. LC traces itself back to which great philosopher?

    a) Aristotle
    b) Plato
    c) Spencer
    d) Cutter
    e) none of the above

14. You have been using LC Class N, Fine Arts, as a text. Which of the following subjects are included in DC 700's, but not in LC Class N?

    a) Music
    b) Photography
    c) Graphic arts
    d) a, b
    e) a, b, c

15. What seems to be a major difficulty with use of almost any classification scheme?

    a) overlapping of subject areas
    b) no real major difficulty
    c) language problems
    d) none of the above



16. In the tracings on a card, subject entries:

    a) precede other added entries
    b) go behind Roman numerals
    c) follow the other added entries
    d) (a) and (b)
    e) none of the above

17. What is the purpose of a See also reference?

    a) To direct a reader from a non-used heading to a used heading
    b) To provide historical kinds of information for the user of the card catalog
    c) Both (a) and (b)
    d) To direct a user to material related to the heading consulted
    e) None of the above



18. What is the purpose of a Uniform Title?

    a) To make added work for a cataloger
    b) To bring together all catalog entries for a given work for which various editions, translations, etc. have various titles
    c) To provide a method for standardizing title entries
    d) None of the above

19. What is the entry word for the Holy Bible?

    a) The Bible
    b) The Holy Bible
    c) Under name of translator
    d) Bible
    e) None of the above

20. Generally speaking, when two corporate bodies have the same name, but are located in different places, then:

    a) some arbitrary device is used to distinguish between them
    b) the name of the place is added
    c) one is entered under place; the other is entered under its corporate name
    d) hopefully such will not happen
    e) none of the above



21. What is the primary importance of MARC?

    a) large data base
    b) communications device
    c) networking device
    d) standardization
    e) none of the above



22. How are MSS housed?

    a) In boxes
    b) on the regular shelves
    c) In acid-free manila folders
    d) none of these






23. What is the major difficulty in establishing Chinese personal names?

    a) Language impossible
    b) one person may have several names which are all legitimate to use
    c) none
    d) few records available
    e) none of the above--there is one, but it isn't listed above




24. What are holographic manuscripts?

    a) typewritten
    b) printed
    c) handwritten
    d) dittoed
    e) none of these





25. Day-books, journals, diaries would be categorised as

    a) printer's copy
    b) MSS written before the invention of printing
    c) author's first drafts
    d) correspondence not written for publication
    e) private papers









